Abrami et al. (this issue) provide a review of e-learning in Canada from 2000 onwards by 
synthesizing information drawn from multiple sources, not only primary research. In total, 
there were 726 documents included in our review: 235 views expressed in the public 
printed media (an expression of general public opinion); 131 views from trade/practitioner 
perspectives; 88 views of policy-makers contained in public policy documents; 120 
sources of evidence contained in reviews of research; and 152 sources of evidence 
contained in primary empirical studies. As far as we know, this is the first review of its kind 
to be as inclusive of sources. This is both our review's greatest strength, allowing us to 
determine whether consensus existed among myriad sources, and potentially its greatest 
weakness, as both the time for such a comprehensive review (our contract with CCL 
specified the review needed to be completed in 90 days) and the cost meant we sacrificed 
a degree of depth for an increase in breadth. 

We reached a number of conclusions in our review, a few of which bear repeating: 

• Conclusions from Canadian primary research, international literature reviews, policy documents, 
media reports, and practitioner publications are mostly favourable towards the use and impact of 
e-learning (i.e., student achievement, motivation, and other outcomes) in Canada. 

• In Canada there is a lack of evidence in some theme areas, notably early childhood learning, 
and a lack of experimental and quasi-experimental evidence that would allow unambiguous 
causal conclusions to be drawn about effectiveness. 

• The quality and scope of the research evidence does not match the time, cost and resources 
that have been and will be dedicated to the development and implementation of e-learning. 

• There is a need for programs of development for new initiatives that have high-quality research 
and evaluation programs or components built-in as a forethought, not an afterthought. 

We also expressed important limitations, some of which bear repeating: 

• There is an unanswered question concerning the emphasis which is placed on deployment, 
attendant costs, and what one might take away, or not add, given the expense of an e-learning 
delivered curriculum. 

• There are promising areas of new development focusing on specific applications of technology 
such as: learning objects and repositories; standardization of metadata; electronic portfolios; 
and broadband enabled lifelong learning projects. 

• We emphasized Canadian primary research to determine the nature and extent of evidence in 
our country. Our ability to compare Canadian evidence with research from other countries was 
limited and indirect. We included only literature reviews of non-Canadian primary research. 

• Because of the scope of this undertaking, its novelty, and the time constraints under which we 
operated, we have been able to provide only a rough portrait of the evidence and opinions. 



• While we incorporated a large number of explanatory variables or study features, we are certain 
that a finer analysis of the literature would yield far more. 

• We did not examine the evidence from a theoretical perspective, in part because of the time 
limits, but primarily because there is little in the way of theory-testing research on e-learning 
that can be synthesized. 

• There were methodological challenges and shortcomings to this review and to our use of an 
argument catalogue to synthesize views on e-learning. 

• Our analyses of evidence, including the primary evidence, are based primarily on frequency 
analyses or vote counts of impacts without regard to the methodological quality of the evidence. 

After we completed our review for the Canadian Council on Learning, we contacted Michele 
Jacobsen, the Editor of the Canadian Journal of Learning and Technology, not only to learn 
whether the Journal was interested in publishing our work but also to see whether the 
paper merited enough attention to request comments from Canadian experts to which we 
could reply. We appreciate the willingness of the journal to publish our work and we are 
especially grateful to Terry Anderson, Margaret Haughey, Heather Kanuka and Richard 
Schwier for examining our work in such detail. While we do not agree on several key 
points, we believe this printed dialogue does much to improve our thinking and to advance 
understanding amongst us all. 

In our rejoinder to the commentaries, we explore several issues: a) the nature and 
importance of systematic reviews; b) answering questions about what works and why; and 
c) what is e-learning and what are its impacts. In doing so, we do not address every point 
raised in the commentaries, some of which we agree with, some of which are also covered 
in our review, and some of which we disagree with. 

The nature and importance of systematic reviews 

In her commentary, Kanuka (this issue) notes that: 

Missing in the review on achievement are research findings which have revealed students infrequently engage in the 
communicative processes that comprise critical discourse—an essential component of achievement as it relates to 
higher levels of learning (see for examples: Aviv, Zippy, Ravid & Geva, 2003; Bonk & Cunningham, 1998; Bullen, 
1999; Davis & Rouzie, 2002; De Laat, 2001; Garrison, Anderson & Archer, 2001; Gunawardena, Carabajal & Lowe, 
2001; Gunawardena, Lowe & Anderson, 1997; Jeong, 2004; Kanuka, 2005; Kanuka & Anderson, 1998; Lopez-Islas, 
2001; McKlin, Harmon, Evans & Jones, 2002; McLaughlin & Luca, 2000; Meyer, 2003; Nussbaum, Hartley, Sinatra, 
Reynolds & Bendixen, 2002; Pawan, Paulus, Yalcin & Chang, 2003; Pena-Shaff, 2005; Pena-Shaff, Martin, & Gay, 
2001; Pena-Shaff & Nicholls, 2004; Rourke, 2005; Rovai & Barnum, 2003; Thomas, 2002; Vaughan & Garrison, 
2005; Veerman, Andriessen, & Kanselaar, 2000; Wilson, et al., 2003; Yakimovicz & Murphy, 1995). Research 
conducted by Angeli, Valanides, and Bonk (2003) is representative of many of these studies’ conclusions: “students 
primarily share personal experiences amongst themselves, and their responses appeared to be subjective and naive at 
times. Students’ discourse was also extremely conversational and opinionated and showed little evidence of critical 
thinking” (p. 40). It has been difficult for most of us concerned with e-learning in higher education to ignore these 
disappointing results—and, yet, these findings have not been reflected in the team’s review of the literature, (p. 88). 


Kanuka's conclusion may or may not be correct, but it provides us an opportunity to examine 
critically how she reached it. We do so not to address a criticism at our colleague but 
instead to underscore the importance of describing and using systematic procedures when 
reviewing research. 

Is the evidence complete? The evidence and the citations given in support of Kanuka's 
conclusion represent a limited review of the literature on the relationship between 
students' online communication and critical discourse, posited as an aspect of higher-order 
learning and critical thinking. Because there are many studies cited which appear to reach 
this result, Kanuka concludes that these results are firmly held, implying that there are 
few, if any, courses, students, and contexts where the findings would be otherwise. This is 
a form of review by vote counting that we will comment on later in our rejoinder. At first 
glance, Kanuka seems to ignore the research on Scardamalia and Bereiter's Knowledge 
Forum (1996) and Feenberg's Textweaver (n.d.), which we mentioned at the end of our 
review, among others. We wonder whether other studies showing positive results have 
also been excluded. For example, very recently a call for submissions to Contemporary 
Educational Psychology on Collaborative Discourse, Argumentation and Learning states: 
"There is a small but growing body of evidence that collaborative student discourse (i.e., 
reflective discussions among students about academic content) can promote deep and 
meaningful learning and enhance students' reasoning skills" (2006). In fact, Kanuka 
(2005) reports that her own action-based research showed "evidence of higher levels of 
learning resulted from [synchronous online discussion]" (Brainstorming section, ¶ 5). 

The importance of systematicity. A systematic review attempts to be objective, 
repeatable, and transparent by avoiding the subjective and idiosyncratic weaving together 
of one's impressions of evidence. Our worry about many narrative reviews which do not 
follow systematic procedures is the increased probability of divergent interpretations of 
different collections of evidence leading to confusion and frustration among researchers, 
policy-makers and practitioners. 

So in the case of the systematic reviews of the primary research that we have conducted 
and reported on (e.g., Bernard et al. 2004), and by extension used in the Argument 
Catalogue, we are careful to lay out the steps we followed in the conduct of the review and 
to explain carefully how we proceeded at each step. These review steps include: a) 
identifying and explaining a core question; b) systematically searching the literature; c) 
articulating inclusion and exclusion criteria; d) extracting key indicators of effect or 
outcome; e) coding study features; and f) summarizing key findings and exploring 
variability among the results. In a meta-analysis, step d) involves extracting effect sizes 
(i.e., quantitative indicators of standardized mean difference between a treatment and a 
control group), but the indicators may take other forms in other kinds of reviews, as they 
did in our use of the Argument Catalogue. Therefore, we contend that, regardless of whether a 
quantitative review, a qualitative (narrative) review, or an Argument Catalogue is being 
conducted, these steps are important to follow because systematicity is an important 
aspect of objectivity, which also maximizes inclusiveness and fairness and minimizes bias. 
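
To make step d) concrete, the following is a minimal sketch of one common way to compute a standardized mean difference (Cohen's d with a pooled standard deviation). The choice of estimator is an assumption made for illustration, since the text does not say which estimator was used, and the numbers are hypothetical rather than drawn from any study in the review.

```python
import math


def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Cohen's d: (treatment mean - control mean) / pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


# Hypothetical group statistics, for illustration only.
d = standardized_mean_difference(
    mean_t=75.0, mean_c=70.0, sd_t=10.0, sd_c=10.0, n_t=30, n_c=30
)
print(round(d, 3))
```

A positive value indicates that the treatment group outperformed the control group, expressed in standard deviation units so that studies using different measures can be placed on a common scale.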

In the case of Kanuka's selection of references, we see that the precise meaning of her 
core question is unclear, there is no evidence of any attempt to systematically search the 
literature, and there is nothing to describe why studies are included or what studies are 
excluded from consideration. In fact, there is a mismatch between the studies in Kanuka's 
citations and the inclusion criteria we employed. Some of her studies were conducted prior 
to 2000, some are not Canadian in origin, and so on. It is not surprising, then, that her 
conclusions on this point are at odds with ours. Table 1 summarizes whether each of 
Kanuka's 28 studies should have been included in our review and the reasons why many of 
them were excluded. It appears that, in fact, we did not miss a large collection of studies 
using our stated inclusion criteria. Had we not explicitly described our inclusion/exclusion 
criteria, we could not draw this conclusion with any certainty. 


Table 1. Twenty-eight studies on online learning and critical discourse 

Citation                                                  Comment* 

Angeli, Valanides & Bonk, 2003                            3 
Aviv, Zippy, Ravid & Geva, 2003                           3 (not in English) 
Bonk & Cunningham, 1998                                   2 
Bullen, 1999                                              2 
Davis & Rouzie, 2002                                      3 
De Laat, 2001                                             3 & 4 
Garrison, Anderson & Archer, 2001                         5 
Gunawardena, Carabajal & Lowe, 2001                       5 
Gunawardena, Lowe & Anderson, 1997                        2 
Jeong, 2004                                               3 
Kanuka, 2005                                              1 
Kanuka & Anderson, 1998                                   2 
Lopez-Islas, 2001                                         3 
McKlin, Harmon, Evans & Jones, 2002                       3 
McLaughlin & Luca, 2000                                   3 
Meyer, 2003                                               3 
Nussbaum, Hartley, Sinatra, Reynolds & Bendixen, 2002     3 
Pawan, Paulus, Yalcin & Chang, 2003                       3 
Pena-Shaff, 2005                                          3 
Pena-Shaff, Martin & Gay, 2001                            3 
Pena-Shaff & Nicholls, 2004                               3 
Rourke, 2005                                              6 
Rovai & Barnum, 2003                                      1 
Thomas, 2002                                              3 
Vaughan & Garrison, 2005                                  7 
Veerman, Andriessen, & Kanselaar, 2000                    3 
Wilson, et al., 2003                                      1 
Yakimovicz & Murphy, 1995                                 2 


* Legend 

1. Included in Abrami et al. 

2. Study was published prior to 2000 (not in the inclusion criteria). 

3. Study was conducted outside Canada (not in the inclusion criteria). 

4. Study was conducted in the workplace (not in the CCL mandate). 

5. Study was not indexed as Canadian and marked as a 141 (Reports-Descriptive) 
in ERIC (search strategy only retrieved 142 (Reports-Evaluative) and 143 
(Reports-Research)). 

6. This dissertation should have been retrieved (search strategy did not use the 
term “distance learning,” but instead “Distance Education” and a variety of 
related terms). 

7. This journal article should have been retrieved, but it is not listed in ERIC. It 
can be found in Google Scholar but we only searched this database for reviews, 
not primary studies given the large number of hits that would have been 
retrieved on this topic. 
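
The comment codes in Table 1 can be tallied to see how the 28 citations distribute across the legend categories. The sketch below is illustrative only: the codes are transcribed from the table in row order, the variable names are ours, and a study with two codes (De Laat, 2001) is counted under its first code.

```python
from collections import Counter

# Comment codes from Table 1, in row order (Angeli ... Yakimovicz).
# De Laat (2001) carries two codes, 3 and 4.
codes = [
    (3,), (3,), (2,), (2,), (3,), (3, 4), (5,), (5,), (2,), (3,),
    (1,), (2,), (3,), (3,), (3,), (3,), (3,), (3,), (3,), (3,),
    (3,), (6,), (1,), (3,), (7,), (3,), (1,), (2,),
]

# Tally each study once, under its first listed code.
primary = Counter(study_codes[0] for study_codes in codes)

# Codes 6 and 7 mark studies that should have been retrieved but were not.
missed = primary[6] + primary[7]

for code in sorted(primary):
    print(f"code {code}: {primary[code]} studies")
print(f"missed by the search: {missed} of {sum(primary.values())}")
```

On this tally, three of the cited studies were already included in the review, and two fall under the codes for studies that should have been retrieved.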


Restricting ourselves to these 28 studies, we need to ask whether any critical discourse 
was evidenced in the studies of online e-learning and how variable this evidence was. Is 
the magnitude of the relationship between online learning and discourse uniformly zero, or 
always very low? Or do aspects of instructional design, course level, student 
characteristics and context affect the relationship? These are the sorts of questions that 
any review, but especially a systematic review, should strive to answer. 

Answering questions about what works and why 

Kanuka's (this issue) critique of our review mentioned that we did not explicitly state our 
"biases and undeclared assumptions," which nevertheless emerged in our treatment of the 
topic. In reply, we acknowledge our positivistic philosophical approach but remind readers 
that we specifically avoided a restrictive approach to the synthesis of evidence and views 
on e-learning. We did not limit ourselves to only primary quantitative studies but also 
included qualitative ones. More generally, the Argument Catalogue was an integration of 
five sources of evidence and views and not just primary research evidence. We believe the 
Argument Catalogue has the potential to become a new, inclusive standard for 
incorporating multiple perspectives on an issue. In some ways we are disappointed that we 
did not communicate well enough the value and importance of this approach to reviewing, 
where we not only included multiple sources but attempted to represent the key findings 
from these sources in both quantitative terms (i.e., vote counts and effect sizes) and 
qualitative terms (i.e., narrative descriptions). 

The remainder of this section is intended to address our philosophical and methodological 
concerns, and to elaborate our considerations with regard to these important matters. 

About positivism. From a positivist perspective, many scientific questions have at their 
core a concern about the (causal) relationship between two variables, often described as 
the relationship between an independent (predictor) variable and a dependent (outcome) 
variable. This is true in the natural sciences as well as the human sciences and is evident 
in the question that drives our review: what do we know about the outcomes or impacts of 
e-learning? We also believe this core concern is of interest to many policy-makers, 
practitioners, and the general public. This interest is not restricted to the United States or 
the What Works Clearinghouse but is evident in other countries (e.g., the Evidence 
Network and the EPPI Centre in the UK, the Nordic and Japanese Campbell Collaboration 
Centres) and internationally (e.g., the Cochrane Collaboration and the Campbell 
Collaboration). In Canada, we have the Canadian Cochrane Collaboration and the CCL, who 
are interested in what works and the promotion of evidence-based practice (see for 
example: http://search.ccl-cca.ca/CCL/Newsroom/Backgrounders/?Language=EN). 

What Haughey (this issue), Kanuka (this issue) and Anderson (this issue) challenge is the 
epistemic tradition that underlies a positivist perspective to scientific inquiry. This has 
been an ongoing controversy in educational and social science research for years and 
unfortunately shows little sign of abating. Their challenge is especially relevant to aspects 
of our Argument Catalogue that resemble a quantitative review of empirical evidence; their 
challenge is less relevant to the narrative portions. In a recent defence of field 
experiments in distance education, Abrami and Bernard (2006) examined various forms of 
this challenge and explained the value and importance of educational experiments. 

The purpose of Abrami and Bernard (2006) was to describe the range of issues involved in 
experimentation, to explain the importance of field experiments—what we can learn and 
not learn—and to discuss how the quality of such research can be improved so as to 
strengthen recommendations for "what works" in our field. We considered arguments for 
and against field experimentation, drawing heavily on a recent working paper by Cook 
(2004). 

There is still a large contingent of researchers who maintain that the world is too complex 
and messy for experiments, and that notions of causality are too difficult to establish, to 
base all evidence for practice and policy making on "simplistic" positivistic approaches to 
empiricism. By contrast, there exists a strong resurgence of interest in randomized control 
trials (RCTs) as "the gold standard" of evidence, which should direct the course of future 
educational development, that is, what works in education. While acknowledging the value of 
non-quantitative forms of research for evaluation, exploration, and hypothesis generation, we 
maintain that there are good reasons to continue with forms of field experimentation in 
education when appropriate questions are being asked. Abrami and Bernard (2006) 
critically examine five arguments against experimentation: philosophical arguments; 
practical arguments; arguments about undesirable trade-offs; arguments that educational 
institutions will not use experimental results; and arguments that experiments are not 
necessary because better alternatives exist. 

For example, philosophical arguments are designed to show that experiments: 1) cannot 
provide unbiased tests of causal hypotheses, and 2) are predicated on a descriptive theory 
of causation that is less useful than explanatory theories of cause. 

Kuhn (1970) among others has argued that experiments are biased by the researcher's 
hopes, opinions, and expectations, thus undermining their neutrality and claims concerning 
truth. However, this criticism is equally applicable to qualitative research methods, and 
it is no reason to dismiss experimentation out of hand. Many claims that are now known 
with a great degree of certainty (e.g., the effects of time on task) were initially established 
through experimentation. Not all evidence is subjective evidence, and not all evidence is 
context-bound. 


The second philosophical argument, one that Haughey (this issue) makes explicitly, 
maintains that experiments are predicated on an overly simplistic theory of causation, 
testing only the impact of a small subset of possible causes, often only a single one, rather 
than the full complexity of factors in a system of causal influences. For example, Haughey 
(this issue) wrote: "Context was so strong an intervening variable that it was impossible 
to parcel out the use of the new technologies, and the teacher's beliefs about, comfort with 
and pedagogical approaches to ICT were equally important and complex intervening 
variables" (p. 115). 

This is one reason why multifactor experiments are designed, but there are practical limits 
to what can be manipulated and carefully controlled in the field. This view helps explain 
why correlational designs and experimental-correlational hybrids can be important 
adjuncts to simple field experiments. In addition, mixed-methodology research attempts 
to bring the strengths of both approaches, generalizability and exploration of context, 
to bear on research problems. And the systematic literature review is an important adjunct 
to field experimentation by providing a comprehensive synthesis of the evidence and a 
means to test for context effects among the collection of studies (see below). 

Furthermore, while experiments may be limited, they are not useless. There are times 
when it is important to know whether a treatment or cluster of treatments "works," even if 
we don't know the exact cause(s). For example, experiments have helped establish the 
positive effects of summer school on achievement, the value of assigning 
and grading homework, and so on. It is important to know whether e-learning "works" and 
especially to know the conditions under which e-learning is beneficial, harmful or of no 
particular value. Such an important question requires not only carefully controlled 
longitudinal investigations, but investigations brought to scale exploring the impact of 
context and sustainability over time and circumstance. 

The importance of integrations. A systematic review of quantitative evidence (a 
meta-analysis) has the following advantages: a) it answers questions about effect size; b) it 
systematically explores the source of variability in effect size; c) it allows for control 
over internal validity by focusing on comparison studies vs. one-shot case studies; d) it 
maximizes external validity or generalizability by addressing a large collection of studies; 
e) it improves statistical power when a large collection of studies is analyzed; f) the effect 
size is weighted by sample size, so that large sample studies have greater weight; g) when a 
review is updated, it allows new studies to be added as they become available or studies 
to be deleted as they are judged to be anomalous; h) it allows new study features and 
outcomes to be added to future analyses as new directions in primary research emerge; i) 
it allows analysis and re-analysis of parts of the dataset for special purposes (e.g., military 
studies, synchronous vs. asynchronous instruction, web-based instruction); and j) it allows 
comment on what we know, what is new, and what we need to know (Abrami, Cohen & 
d'Apollonia, 1988; Bernard & Naidu, 1990). In short, meta-analysis goes far beyond what a 
single study might ever hope to contribute about a phenomenon and provides a greater 
case for the generalizability of results across populations, materials, and methods. It also 
allows us to explore weaknesses in our research practices and methodologies (e.g., 
Bernard et al. 2004), including the quality and discrimination of our research publication 
outlets. The best research syntheses share the qualities of the best primary investigations. 
They are objective, transparent, precise, and repeatable. 
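
Advantages b) and f) can be illustrated with a short sketch. It assumes a fixed-effect model with inverse-variance weights, a common way of giving larger (more precise) studies greater weight, and uses Cochran's Q as the heterogeneity statistic. Both choices are assumptions made for illustration and are not a description of the procedure used in the review. The effect sizes and variances are hypothetical.

```python
def fixed_effect_summary(effects, variances):
    """Return (weighted mean effect, Cochran's Q, degrees of freedom).

    Each study is weighted by the inverse of its sampling variance, so
    studies with larger samples (smaller variances) count for more.
    """
    weights = [1.0 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    return mean, q, len(effects) - 1


# Hypothetical effect sizes and sampling variances for three studies.
effects = [0.10, 0.50, -0.20]
variances = [0.04, 0.02, 0.04]

mean_effect, q_stat, df = fixed_effect_summary(effects, variances)
print(f"weighted mean = {mean_effect:.3f}, Q = {q_stat:.4f}, df = {df}")
```

A Q value that is large relative to its degrees of freedom suggests that the effects vary more than sampling error alone would predict, which is the signal that prompts a search for moderating study features.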

By adopting the Argument Catalogue approach Abrami et al. (this issue) intended to go 
beyond a typical meta-analysis, while maintaining as much as possible the strengths of 
systematic reviews. In most respects the Argument Catalogue follows the same procedure 
as a meta-analytical review, including: 1) an exhaustive systematic literature search 
oriented towards some broad and yet applied (valuable for scientific, trade or policy 
practice) research question; 2) a carefully defined set of comprehensive exclusion criteria 
consistently applied across all studies identified through searches; and 3) a coding system, 
developed through both theoretical analysis and review of a representative sample of 
studies, that is capable of accounting for a variety of reported outcomes and, no less 
importantly, explaining these outcomes in terms of study design and implementation (i.e., 
its methodological, pedagogical, and other related study features). 

What distinguishes the Argument Catalogue from a meta-analysis is that the catalogue 
allows for inclusion of evidence from sources other than quantitative empirical research. An 
Argument Catalogue also attempts to account for the opinions and concerns of other 
parties interested and involved in e-learning: educators (teachers, administrators, and 
policy makers), researchers, students, even the general public (presuming that public 
attitudes are adequately reflected in mass media). 

Another difference, and a potential weakness of the Argument Catalogue, is that because 
of the diversity and breadth of sources, the outcomes it considers are defined more 
loosely, and they include not only measured results of some e-learning practices, but also 
their perceived impacts. Nevertheless, we saw the Argument Catalogue as an opportunity 
to enrich our understanding of the issues concerning e-learning by reconciling different 
points of view. Our study also included a quantitative summary of evidence, which 
reported effect sizes for 17 Canadian primary empirical research studies. 

The importance of combining evidence and viewpoints. In our e-learning review, we 
took a broad definition of e-learning predictor variables, which included information and 
communication technologies used to support interactions for learning, and a broad 
definition of impacts (outcome variables) representing seven categories of outcomes 
including: achievement; motivation/satisfaction; interactivity/communication; meeting 
social demands; retention/attrition; learning flexibility; and cost. We also combined these 
outcomes (except for cost) into a single impact factor. Additionally, we distinguished 
between perceived and measured outcomes. The essence of this question, about the 
(causal) relationship between two variables, is a form of "what works" question. Moreover, 
we recognize the importance of distinguishing different forms of e-learning and different 
outcome measures, and this was the central focus for study features coding and analysis. 

However, reviews are hardly necessary, and additional research is not called for, when the 
findings concerning a core question are identical from study to study, or more generally, 
across all sources. It is the inconsistency of findings that requires additional research and 
explains why reviews are undertaken. Reviews are also conducted to answer questions about the 
general impacts of treatments, if any, and the circumstances under which the effects vary. 
So in our review we also explored factors or study features which might explain the 
variability in findings. The essence of this phase of a review, exploring the variability in 
findings, is a form of exploring why things "work" along with the contextual and other 
factors which come into play to moderate the basic relationship between two variables. In 
doing so, we disagree with Anderson's (this issue) claim that it may not be valid to 
combine studies from disparate contexts. This concern about mixing "apples and oranges" 
by combining a range of studies in a systematic review has been around for some time. It 
is in their similarities and differences that we can identify underlying processes and note 
which mechanisms are generalizable and under what circumstances. Otherwise, we are left 
with no hope of ever being able to cumulate our understanding or apply it widely. 

While we are sensitive to the sharp dichotomy of views which characterizes the 
methodological paradigm wars between quantitative and qualitative researchers, we 
believe there should be more agreement between the camps about both the purpose (i.e., 
the why) of a single investigation and the value of synthesizing investigations, even as the 
debate about the form (i.e., the how) of inquiry continues. And we also hope that the 
debaters come to see the value of mixed methods of inquiry, like the Argument Catalogue, 
that combine both quantitative and qualitative methods. Finally, we believe that the best 
quantitative studies are well suited for hypothesis testing or confirmatory purposes about 
"what works." In contrast, the best qualitative studies are well suited for hypothesis 
generation or for exploring why things work. 

This is especially the case at the level of the individual study. However, the special 
advantages of a systematic review are that the accumulation of evidence across studies 
also allows for the exploration of process explanations of why things work. This is why we 
believe that a good systematic review does more than cumulate what is known: it adds to 
knowledge by exposing consistencies and inconsistencies in findings across contexts and 
exploring why variability exists. 

Strengths and weaknesses of the Argument Catalogue. In a meta-analysis, only 
quantitative studies are cumulated and analysed so that effect sizes are extracted and 
subjected to statistical analyses for heterogeneity and model fitting. Methods like the 
Argument Catalogue are attempts to include a broader range of evidence into the 
systematic review process. Doing so means exploring the consensus of evidence across 
multiple sources by tallying "votes" for each outcome—positive, neutral, or negative. By 
this method, we observe the direction of an effect but not the magnitude of effect. Vote 
counts of evidence have been criticized for favouring large sample studies because, all 
other things equal, a significant effect is easier to find in a large sample study than a small 
sample study. By extension, a consistent effect observed in a vote count is not the same 
as a large effect. For example, concluding that the evidence favours a positive effect of 
e-learning is not the same as saying the effect is small, medium, or large. "How much" 
questions cannot be answered by vote counts. But the cruder vote counting metric meant 
that we were able to include multiple sources of evidence rather than only data from 
quantitative primary studies. In addition, for the small number of experimental (7) and 
quasi-experimental (10) primary studies we located and included, we computed 29 effect 
sizes for the e-learning composite measure. The mean effect size was small (+0.117), but 
the effects were heterogeneous. 
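
The difference between direction and magnitude can be shown with a small sketch. The two sets of effect sizes are hypothetical and the function names are ours. They are chosen so that the vote counts are identical while the average magnitudes differ markedly.

```python
from collections import Counter
from statistics import mean


def direction(effect):
    """Code a numeric effect as a vote: positive, neutral, or negative."""
    if effect > 0:
        return "positive"
    if effect < 0:
        return "negative"
    return "neutral"


def vote_count(effects):
    """Tally votes by direction, discarding magnitude."""
    return Counter(direction(e) for e in effects)


# Hypothetical collections of effect sizes, for illustration only.
small_effects = [0.05, 0.10, 0.08, -0.02]
large_effects = [0.60, 0.90, 0.75, -0.40]

votes_small = vote_count(small_effects)
votes_large = vote_count(large_effects)

print(votes_small == votes_large)                # same tally of votes
print(mean(small_effects), mean(large_effects))  # very different magnitudes
```

Both collections yield three positive votes and one negative vote, so a vote count alone cannot distinguish them. Only the effect sizes show that one collection reflects a much larger average effect than the other.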

Haughey (this issue) suggests: 

For a good literature review we expect a wide gathering of possibly relevant articles, a sifting by type, a review of 
methods, of conclusions and a subsequent grouping by some series of constructs so as to illuminate the reader about 
the topic, the issues which have already been identified by previous researchers, their limitations, and possible 
issues which still need exploration. Does the adding of a rating scale make the process more rigorous? I don’t believe 
so. My hope that we might have found an alternative way to explore the findings of both post-positivist and 
interpretivist or critical studies has not been confirmed by what I read. Perhaps so many of their findings seem 
reasonable because they followed the steps of a good literature review (p. 116). 

Our review contained a wide gathering of relevant articles, five different types of articles 
including both qualitative and quantitative empirical studies, a review of methods (e.g., 
highlighting the lack of field experiments) and of findings, and a subsequent grouping by 
constructs (e.g., the importance of instructional design). And it did not rely just on a rating 
scale, as we also coded articles qualitatively and summarized their major messages. 

Our team reviewed all open-ended (emergent) coding intended to identify, when possible, 
the authors' principal positions, arguments presented, or conclusions reached, in every 
document. The most salient, interesting, informative, or powerful message or the most 
representative or frequently appearing message in each document was extracted. These 
were summarized and organized by the source of evidence. 

Nevertheless, in Abrami, Bernard and Wade (2006), we discuss the strengths and 
weaknesses of the Argument Catalogue further. In our initial efforts at creating an 
Argument Catalogue, we attempted to strike a balance between the extraordinary breadth 
of the e-learning review and the depth we wanted to achieve in conducting a detailed 
analysis of the documents. While we covered five distinct sources of evidence, we did not 
do so comprehensively for all of them, choosing to focus especially on Canadian primary 
research studies, literature reviews and policy documents. 

The Argument Catalogue codebook for our e-learning review was developed through an 
emergent approach by taking a representative sample of documents of various types to 
ensure that the major issues covered by the documents were reflected. The documents 
from each of the five theme areas were subsequently coded using the common codebook. 
The codebook can be found at the CSLP website at 
<http://doe.concordia.ca/cslp/CanKnow/eLearning.php>. One advantage of a common 
codebook is that it allowed us to analyze the coded information from different resources 
within a single database, while also allowing for different types of literature to be analyzed 
as subsets. An example of this is the capacity to analyze study features (such as 
publication date or technology addressed) across all types of publications, while being able 
to select studies containing quantitative data to extract effect sizes for a meta-analytic 
review. 

Throughout the process of coding information from the different resources, certain issues 
became evident. The more comprehensive the codebook became in order to address the 
different sources, the larger it grew. On one hand, this led to a higher level of overlap and 
interconnection between some codes. On the other hand, some codes were totally 
irrelevant for some resources. For example, all practitioners' articles had missing codes for 
research design, effect sizes, and other features that are only pertinent to primary 
research or reviews. 

In future, we will explore the strengths and weaknesses of a more detailed and in-depth 
analysis of all the sources of evidence. This can be achieved through developing a more 
comprehensive codebook, with additional study features, including sets of features that 
apply specifically to each source of evidence. 

Questions about cost effectiveness and value are worth considering given the time and 
resources needed to conduct a thorough examination of the multiple perspectives of an 
issue. Will a small random sample of media, practitioner and policy documents be a 
legitimate proxy for an inclusive search? Is the quality in the detail or is a cursory 
examination of the global issues sufficient? Answers to questions of this sort may only 
come with experience gained from seeing the extent to which, for example, systematic 
reviews are improved and the impact of both Argument Catalogues and reviews increases. 

Our expertise and experience with the quantitative techniques, often referred to as 
meta-analysis, coupled with their popularity, led us to use this approach to systematic reviews 
of research in our initial Argument Catalogue. As the field of systematic reviewing matures, 
other methods are emerging including those that synthesise evidence using qualitative 
techniques, and Dixon-Woods et al. (2005) critically summarise a range of methods 
including narrative summary, thematic analysis, grounded theory, meta-ethnography, 
meta-study, realist synthesis, Miles and Huberman's data analysis techniques, content 
analysis, case survey, qualitative comparative analysis and Bayesian meta-analysis. The 
notions we explore here on synthesising multiple sources, not only quantitative and/or 
qualitative research evidence, can also be extended to these other review techniques. 

In our inaugural Argument Catalogue we did not weight the primary evidence by sample 
size but instead treated each study equally. Similarly, we did not account for the size and 
scope of the documents in aggregating evidence from these sources. A medium-sized 
newspaper article was given the same weight as a lengthy literature review. In future, we 
might consider giving more weight to documents as a function of their scope. 
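
To make the consequence of equal weighting concrete, the following minimal sketch contrasts an unweighted mean effect size with a sample-size-weighted mean. The function names, effect sizes, and sample sizes are hypothetical and chosen only for illustration; they are not data from the review, and weighting by sample size is only one of several possible schemes.

```python
def unweighted_mean(effects):
    """Treat every study equally, regardless of its size."""
    return sum(effects) / len(effects)


def sample_size_weighted_mean(effects, sample_sizes):
    """Weight each effect size by its study's sample size (an assumed scheme)."""
    total_n = sum(sample_sizes)
    return sum(d * n for d, n in zip(effects, sample_sizes)) / total_n


# Hypothetical studies: one small study with a large effect, two larger ones.
effects = [0.40, 0.10, -0.05]
sample_sizes = [20, 200, 80]

print("unweighted:", round(unweighted_mean(effects), 3))
print("weighted:  ", round(sample_size_weighted_mean(effects, sample_sizes), 3))
```

In this invented example the small study pulls the unweighted mean upward, whereas the weighted mean stays closer to the results of the larger studies.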



Our analyses of evidence from the various sources were based on frequency or vote 
counts of impacts. We combined all sources of evidence, regardless of methodological 
quality, in order to have an idea about the consistency of the effect of e-learning as 
reflected by the different sources. Vote counts provide such information about the 
consistency of effects and not their size, but size matters for policy decisions, as do 
considerations of cost. Future research should explore both. 
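
A vote count of this kind can be sketched as a simple tally. The example below assumes each document's impact has been coded as +1 (positive), 0 (neutral), or -1 (negative); the function name and the codes shown are hypothetical and are not data from the review.

```python
from collections import Counter


def vote_count(codes):
    """Tally impact codes: +1 positive, 0 neutral, -1 negative."""
    tally = Counter(codes)
    n = len(codes)
    return {
        "positive": tally[1],
        "neutral": tally[0],
        "negative": tally[-1],
        # The proportion shows how consistent the votes are, not how large the effects are.
        "proportion_positive": tally[1] / n if n else 0.0,
    }


# Hypothetical codes for seven documents, for illustration only.
example_codes = [1, 1, 0, -1, 1, 0, 1]
print(vote_count(example_codes))
```

The tally records only the direction of each reported impact, which is why it says nothing about the magnitude of effects.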

In most systematic reviews, especially quantitative syntheses of primary evidence, 
considerable effort is expended in judging the quality of the evidence using a plethora of 
methodological criteria often focusing on, but not necessarily limited to, a study's internal 
validity or the certainty with which causal inferences are likely. We did not apply these 
quality judgments as exclusion criteria (given the limited number of studies which would 
meet rigorous standards), nor did we apply any quality criteria whatsoever in judging the 
other sources of evidence. We did, however, compare the primary evidence, the literature 
reviews and the conclusions we extracted from other sources to identify similarities and 
differences. We understand that a review based on an Argument Catalogue gives voice to 
popular perceptions and the Zeitgeist of current views, whether they are formed carefully 
or carelessly. We believe that the advantages of a comprehensive review such as ours are 
worth the risks and the costs of broad inclusivity. But the final judgment lies with readers. 

In addition, we do not know for which areas it is appropriate to undertake an Argument 
Catalogue and for which areas it is not. Certain aspects of health and social policy, 
including educational practices, seem ideal candidates for integrating the arguments from 
diverse constituencies. Other aspects of policy may be less ideal, for example, regarding 
specific pharmacological or surgical interventions. In general, an Argument Catalogue is 
best employed in those situations where knowing what works needs to be tempered by 
knowing how others perceive what works from their differing viewpoints. 

With regard to the applicability of an Argument Catalogue approach, we do not yet know 
whether, and to what extent, it will serve its intended purpose: to inform and engage 
policy makers, practitioners and the general public so that more evidence is given greater 
credibility and exposure, and is eventually taken up and used. 

We like what Schwier (this issue) suggested about the use of narrative as a 
complementary approach to quantitative research as a means to provide insight and 
develop research questions. It reminded us that academic reviews need to be translated 
into clear, concise language and jargon-free narratives or "stories" that policy-makers and 
practitioners can understand. We developed Knowledge Links for this purpose and prepared 
English and French versions for our e-Learning review. See 
< http://doe.concordia.ca/cslp/RA-Themes CanKnow.php >. 

Finally, our conclusions depend on the quality of our coding, which is dependent on the 
quality of the reports, and the reports are static snapshots of long-term complex and 
dynamic processes. Quantitative reviews are limited in the nature and the amount of 
information that can be coded, and we have no consensually acceptable techniques for 
producing qualitative reviews (which are also derived from simplified reports). Given the 
complexity of the processes being studied, and the variability of the methods being used in 
primary studies, and especially the other sources of evidence we included, there are 
limitations to what can be coded and what can be concluded. As Bernard et al. (2004) 
lamented, reviews are limited by the quality, comprehensiveness and detail of the 
evidence they synthesize and face the challenges of what we described as the 
"methodological morass." 

What is e-learning and what are its impacts? 

Haughey (this issue) makes an articulate case that the CCL definition of e-learning we 
used in our review is very expansive and raises questions about the variety of possibilities 
it entails. Both Haughey (this issue) and Kanuka (this issue) are concerned that the 
breadth of the definition, and consequently the breadth of our review, mask important 
details because the granularity of the review is too large. 

In our review, we focused on the impacts of e-learning in general across a range of 
outcomes including achievement, motivation/satisfaction, interactivity/communication, 
meeting social demands, attrition/retention, learning flexibility, a composite impact 
measure, and cost. We also examined the impacts of e-learning for five CCL theme areas 
except the workplace: adult education, early childhood education, elementary/secondary 
education, postsecondary education, and health and learning. And we did not search 
specifically for informal learning environments as suggested by Schwier (this issue). In 
addition, for the data as a whole and also separately for the literature reviews and primary 
Canadian research combined, we looked to see whether there were facets or aspects of 
e-learning related to impacts including: contexts of technology use, technology tools, 
pedagogical uses of technology, and location. 

We agree with Haughey (this issue) that the "larger societal context" should be considered 
whenever the impacts of e-learning interventions are addressed. The broad set of study 
features in our Argument Catalogue was intended to do exactly that, to the extent the 
data reported in the existing literature allowed. Our sincere hope is that researchers 
designing and conducting new studies in the area will pay more attention to investigating, 
and reporting on, the "societal context" in which e-learning events occur. 

We produced a rough sketch of the evidence or an overall picture of recent Canadian 
evidence along with four other sources. This rough sketch should not be interpreted to 
mean there is uniformity or consistency in the findings. The generally positive impact of 
e-learning does not mean that all applications of e-learning have positive results, all the 
time, and in every context. In fact, our analyses attempted to explore the inconsistencies 
we found. 

Table 8 of Abrami et al. (this issue) presents the summary analyses of perceived 
e-learning impacts on seven outcomes from five sources of evidence, with the impact 
measures recoded so that +1 signifies a positive outcome, 0 a neutral outcome, and -1 a 
negative outcome. For only one of 35 results were the data uniformly positive: general 
public opinion of learning flexibility afforded by e-learning. For all the other results the 
findings indicated some degree of variability, and we attempted to explore this variability 
further by analysing study features. 

Similarly, we did not treat e-learning only as an encompassing term but also looked at 
specific features. For example, we coded for: 1) Context of technology use (i.e., distance 
education, in class, blended, unspecified); 2) Type of tools used (i.e., 
internet/intranet/on-line/web, virtual reality/learning objects/simulations, technology 
integration [computers and software for particular purposes], unspecified); and 3) Intervention 
type, including: Instructional (e.g., drill, practice, tutorials, remediation); Communicative 
(e.g., e-mail, ICQ, computer conferencing, LCD projector); Organizational (e.g., data base, 
spreadsheets, record keeping, lesson plans); Analytical/Programming (e.g., statistics, 
charting, graphing, drafting, robotics); Recreational (e.g., games); Expansive (e.g., 
simulations, experiments, exploratory environments, brainstorming); Creative (e.g., 
desktop publishing, digital video, digital camera, scanners, graphics); Expressive (e.g., 
word processing, on-line journal); Evaluative (e.g., assignments, portfolio, testing); 
Informative (e.g., Internet, CD-ROM); and Unspecified/Missing. 

We concluded in our review that a more in-depth analysis is called for. We agree with the 
reviewers on this point, and we appreciate the suggestions they offered for exploring the 
literature further. However, there are two important caveats. First, if the collection of 
evidence included in our review is not extended, further analyses will not change the 
general results but may only explain the variability in different ways. Second, detailed 
examinations may show that much evidence is missing, and fine-grained analyses may 
conclude that more evidence needs to be collected. 

Kanuka (this issue), for example, suggested looking more closely at achievement 
outcomes and contrasting: a) lower-level learning impacts (i.e., surface learning) and 
higher-level learning impacts (i.e., deep learning); b) cognitive, affective and psychomotor 
impacts (although we did examine achievement impacts and motivation/satisfaction); and 
c) disciplinary differences (e.g., language arts, social sciences, math and engineering, 
etc.). Similarly, she suggests a more detailed conceptualization and fine-grained analysis 
of the outcomes included. 

Kanuka (this issue) is especially concerned about our treatment of the 
interactivity/communication outcome suggesting, among other things, that we should have 
separated interacting with technology or content from interacting with peers or the 
instructor. She makes a strong case by resorting to research that largely falls outside the 
scope of our investigation. This highlights the previous point we made about needing to 
find a balance between breadth and depth in any review. 

Schwier (this issue), citing Downes, also has a collection of learning outcomes worthy of 
further exploration, including: how to predict consequences; how to read; how to 
distinguish truth from fiction; how to empathize; how to be creative; how to communicate 
clearly; how to learn; how to stay healthy; how to value yourself; and how to live 
meaningfully. This list reminds us of CCL's Composite Learning Index and the Conference 
Board of Canada's Learning Skills Profile. All these lists are conceptually rich and broad in 
scope, and it would be wonderful to explore e-learning impacts in these categories. If only 
there was evidence available to do so. 

Similarly, Schwier (this issue) elaborates on our finding concerning the importance of 
instructional design in e-learning. He offers suggestions for new directions for the role of 
instructional designers that may bear fruit in future applications of technology for learning. 

In addition, Schwier (this issue) offers ten questions that both primary research and 
reviews of research should address. While many of these questions fall outside the purview 
and mandate of our review, and may not be answerable until more research is undertaken, 
they serve an important purpose in pointing the way towards future directions for inquiry 
and synthesis. 

Concluding remarks 

As an overall concern, we are worried that in-depth reviews will become too narrow in 
scope to address larger questions about impact and import, losing sight of the forest for 
the trees. At the same time, we share the concern that too general a picture may mask 
underlying and important variability in impacts. The answer is that no single review, like no 
single investigation, answers all the questions about a topic. And like a good primary 
investigation, a good review should stimulate further investigations and further reviews. 
We hope that our review and the commentaries have done just that. 

Finally, we want to thank again the Editor of the Canadian Journal of Learning and 
Technology for affording this visibility to our work. And we especially appreciate the 
reviewers who in critiquing our review took the task as seriously as they did. We can think 
of no better way of ending our rejoinder than by echoing Anderson's (this issue) point that 
we need to ensure that Canadians "are able to take advantage of this most important 
educational development since the printed text" (p. 107). 